631 research outputs found

    Statistical Analysis of a Posteriori Channel and Noise Distribution Based on HARQ Feedback

    In response to a comment on one of our manuscripts, this work uses statistical approaches to study the posterior channel and noise distributions conditioned on the NACKs and ACKs of all previous transmissions in a HARQ system. Our main result is that, unless the coherence interval (in time or frequency) is large, as in the block-fading assumption, the posterior distribution of the channel and noise either remains almost identical to the prior distribution or mostly follows the same class of distribution as the prior. In the latter case, the difference between the posterior and prior distributions can be modeled as a parameter mismatch, which has little impact on certain types of applications. Comment: 15 pages, 2 figures, 4 tables
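The conditioning idea can be illustrated with a toy Monte Carlo sketch. Here a NACK is modeled, purely as an assumption for illustration, as the event that the instantaneous channel gain falls below a decoding threshold; the posterior then stays in a tractable class (a truncated exponential), echoing the abstract's "same class, mismatched parameters" observation:

```python
import random

# Toy model (illustrative assumption, not the paper's system model):
# Rayleigh fading gives |h|^2 ~ Exp(1); a NACK is modeled as the outage
# event |h|^2 < thr. Conditioning on a NACK truncates the exponential.
random.seed(0)
thr = 1.0
samples = [random.expovariate(1.0) for _ in range(100_000)]
post = [g for g in samples if g < thr]  # condition on the NACK event

prior_mean = sum(samples) / len(samples)
post_mean = sum(post) / len(post)
print(f"prior mean ~ {prior_mean:.2f}, posterior mean ~ {post_mean:.2f}")
```

The posterior mean shifts downward, but the conditioned distribution is still exponential in form (truncated), i.e., a parameter-level mismatch rather than a change of distribution class.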

    High-Performance Matrix Multiplication: Hierarchical Data Structures, Optimized Kernel Routines, and Qualitative Performance Modeling

    The optimal implementation of matrix multiplication on modern computer architectures is of great importance for scientific and engineering applications. However, achieving optimal performance for matrix multiplication has been continuously challenged both by the ever-widening performance gap between the processor and the memory hierarchy and by the introduction of new architectural features in modern processors. The conventional way of dealing with these challenges benefits significantly from the blocking algorithm, which improves data locality in the cache memory, and from highly tuned inner kernel routines, which in turn exploit the architectural characteristics of the specific processor to deliver near-peak performance. A state-of-the-art refinement of the blocking algorithm is the self-tuning approach, which performs an extensive combinatorial search over parameter spaces. Other recent approaches include explicit blocking for the TLB (Translation Lookaside Buffer) and a hierarchical formulation that employs memory-friendly Morton ordering (a space-filling-curve methodology). This thesis compares and contrasts the TLB-blocking-based and Morton-order-based methods for dense matrix multiplication, and offers a qualitative model to explain the observed performance behavior. Comparisons against a self-tuning library and the vendor library are also provided for the Alpha architecture. The benchmark experiments demonstrate that neither conventional blocking-based implementations nor the self-tuning libraries achieve consistently high performance in dense matrix multiplication at relatively large square matrix sizes. Instead, architectural constraints evidently restrict the critical path and the options available for optimal performance, so that the relatively simple strategy and framework presented in this study deliver higher and flatter overall performance.
Interestingly, maximal inner-kernel efficiency does not guarantee globally minimal multiplication time. Moreover, efficient and flat performance is possible at all problem sizes that fit in main memory, in contrast to the jagged performance curves often observed with blocked and self-tuned blocked libraries.
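The cache-blocking idea the abstract builds on can be sketched minimally: multiply in small tiles so that each tile triple stays resident in fast memory. The block size here is a tunable assumption, not a value from the thesis:

```python
def blocked_matmul(A, M, n, B=2):
    """Multiply two n x n matrices (lists of lists) tile by tile.

    Looping over B x B blocks improves data locality: each (ii, kk, jj)
    tile triple is reused repeatedly before moving on, which is the core
    of the blocking algorithm the abstract describes.
    """
    C = [[0.0] * n for _ in range(n)]
    for ii in range(0, n, B):
        for kk in range(0, n, B):
            for jj in range(0, n, B):
                # Inner kernel: a small dense multiply on one tile triple.
                for i in range(ii, min(ii + B, n)):
                    for k in range(kk, min(kk + B, n)):
                        a = A[i][k]
                        for j in range(jj, min(jj + B, n)):
                            C[i][j] += a * M[k][j]
    return C
```

Real libraries add TLB-aware block sizes, Morton-ordered storage, and hand-tuned inner kernels on top of this same loop structure; the sketch only shows the tiling itself.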

    Lorentz Quantum Computer

    A theoretical model of computation is proposed based on Lorentz quantum mechanics. Besides the standard qubits, this model has an additional bit, which we call the hyperbolic bit (or hybit for short). A set of basic logical gates is constructed and their universality is proved. As an application, a search algorithm is designed for this computer model and is found to be exponentially faster than Grover's search algorithm.

    Revisiting Classifier: Transferring Vision-Language Models for Video Recognition

    Transferring knowledge from task-agnostic pre-trained deep models to downstream tasks is an important topic in computer vision research. Along with the growth of computational capacity, we now have open-source vision-language pre-trained models at large scales of both model architecture and training data. In this study, we focus on transferring knowledge for video classification tasks. Conventional methods randomly initialize the linear classifier head for vision classification, leaving the use of the text encoder for downstream visual recognition tasks unexplored. In this paper, we revise the role of the linear classifier and replace it with knowledge from the pre-trained model: we utilize the well-pretrained language model to generate good semantic targets for efficient transfer learning. The empirical study shows that our method improves both the performance and the training speed of video classification, with a negligible change to the model. Our simple yet effective tuning paradigm achieves state-of-the-art performance and efficient training across various video recognition scenarios, i.e., zero-shot, few-shot, and general recognition. In particular, our paradigm achieves state-of-the-art accuracy of 87.8% on Kinetics-400, and also surpasses previous methods by 20~50% absolute top-1 accuracy under zero-shot and few-shot settings on five popular video datasets. Code and models can be found at https://github.com/whwu95/Text4Vis . Comment: Accepted by AAAI-2023. Camera Ready Version
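The core idea of replacing a randomly initialized classifier with text-derived weights can be sketched as follows. All names and shapes here are illustrative assumptions, not the Text4Vis API: the rows of the "classifier" are frozen text-encoder embeddings of the class names, and classification is cosine similarity against them.

```python
import numpy as np

rng = np.random.default_rng(0)
num_classes, dim = 4, 8

# Stand-in for a pre-trained text encoder applied to class-name prompts
# (e.g. "a video of {class}"); rows are L2-normalized class embeddings.
text_embeds = rng.normal(size=(num_classes, dim))
text_embeds /= np.linalg.norm(text_embeds, axis=1, keepdims=True)

# Stand-in for a video feature from the visual encoder, also normalized.
video_feat = rng.normal(size=(dim,))
video_feat /= np.linalg.norm(video_feat)

# The text embeddings play the role of the linear classifier's weight
# matrix: logits are cosine similarities, no random head is trained.
logits = text_embeds @ video_feat
pred = int(np.argmax(logits))
print("predicted class index:", pred)
```

Because the classifier weights carry semantic structure from pre-training, this also extends naturally to zero-shot recognition: unseen classes only require encoding their names.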

    In-situ electrochemical fabrication of natural contacts on single nanowires

    We report a template-based in-situ electrochemical method for fabricating natural electrical contacts on single nanowires using a pair of cross-patterned electrodes. Such contacts are highly stable under thermal cycling between room temperature and millikelvin temperatures. Direct imaging of the single-nanowire contacts by scanning electron microscopy is also demonstrated. Comment: 13 pages, 4 figures